Stackelberg value
Is Knowledge Power? On the (Im)possibility of Learning from Strategic Interactions
When learning in strategic environments, a key question is whether agents can overcome uncertainty about their preferences to achieve outcomes they could have achieved absent any uncertainty. Can they do this solely through interactions with each other? We focus this question on the ability of agents to attain the value of their Stackelberg optimal strategy and study the impact of information asymmetry. We study repeated interactions in fully strategic environments where players' actions are decided based on learning algorithms that take into account their observed histories and knowledge of the game. We study the pure Nash equilibria (PNE) of a meta-game where players choose these algorithms as their actions.
Strategizing against No-regret Learners
Deng, Yuan, Schneider, Jon, Sivan, Balasubramanian
How should a player who repeatedly plays a game against a no-regret learner strategize to maximize his utility? We study this question and show that under some mild assumptions, the player can always guarantee himself a utility of at least what he would get in a Stackelberg equilibrium of the game. When the no-regret learner has only two actions, we show that the player cannot get any higher utility than the Stackelberg equilibrium utility. But when the no-regret learner has more than two actions and plays a mean-based no-regret strategy, we show that the player can get strictly higher than the Stackelberg equilibrium utility. We provide a characterization of the optimal game-play for the player against a mean-based no-regret learner as a solution to a control problem. When the no-regret learner's strategy also guarantees him a no-swap regret, we show that the player cannot get anything higher than a Stackelberg equilibrium utility.
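To make the benchmark in this abstract concrete, the following is a minimal sketch of how a Stackelberg value can be approximated in a one-shot bimatrix game. It is not the paper's method: it assumes a leader with exactly two actions, a follower who best-responds to the committed mixed strategy with ties broken in the leader's favour, and a uniform grid search over the commitment probability. The function name `stackelberg_value`, the `grid` parameter, and the example payoff matrices are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def stackelberg_value(leader_payoff, follower_payoff, grid=100):
    """Approximate the leader's Stackelberg value in a 2 x n bimatrix game.

    leader_payoff[i][j] and follower_payoff[i][j] are the payoffs when the
    leader plays row i and the follower plays column j. The leader commits to
    playing row 0 with probability p, and p is searched over a uniform grid.
    Assumption: the follower breaks ties in the leader's favour.
    """
    n_cols = len(leader_payoff[0])
    best_value, best_p = None, None
    for k in range(grid + 1):
        p = Fraction(k, grid)
        mix = (p, 1 - p)
        # Expected payoff of every follower column against the commitment.
        f_exp = [sum(mix[i] * follower_payoff[i][j] for i in range(2))
                 for j in range(n_cols)]
        l_exp = [sum(mix[i] * leader_payoff[i][j] for i in range(2))
                 for j in range(n_cols)]
        top = max(f_exp)
        # Among the follower's best responses, take the one the leader prefers.
        value = max(l_exp[j] for j in range(n_cols) if f_exp[j] == top)
        if best_value is None or value > best_value:
            best_value, best_p = value, p
    return best_value, best_p


# Hypothetical example game (not taken from the paper).
LEADER = [[2, 4], [1, 3]]
FOLLOWER = [[1, 0], [0, 1]]

value, p = stackelberg_value(LEADER, FOLLOWER)
print(f"Stackelberg value ~ {value} with P(row 0) = {p}")
```

In this example game the leader's row 0 is strictly dominant, so simultaneous play yields the leader a payoff of 2, while committing to an even mix makes the follower indifferent and lifts the leader's payoff to 7/2. The grid search is only an approximation in general; it is exact here because the optimal commitment probability happens to lie on the grid.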